Protein folds in the worm genome.

Authors

  • M Gerstein
  • J Lin
  • H Hegyi
Abstract

We survey the protein folds in the worm genome, using pairwise and multiple-sequence comparison methods (i.e., FASTA and PSI-BLAST). Overall, we find that approximately 250 folds match approximately 8000 domains in approximately 4500 ORFs: about 32 matches per fold, involving a quarter of the total worm ORFs. We compare the folds in the worm genome to those in other model organisms, in particular yeast and E. coli, and find that the worm shares more folds with the phylogenetically closer yeast than with E. coli. There appear to be 36 folds unique to the worm compared to these two model organisms, and many of these are obviously implicated in aspects of multicellularity. The most common fold in the worm genome is the immunoglobulin fold, and many of the common folds are repeated in various combinations and permutations in multidomain proteins. In addition, an approach is presented for the identification of "sure" and "marginal" membrane proteins. When applied to the worm genome, this reveals a much greater relative prevalence of proteins with seven transmembrane helices in comparison to the other completely sequenced genomes, which are not of metazoans. Combining these analyses with some other simple filters allows one to identify ORFs that potentially code for soluble proteins of unknown fold, which may be promising targets for experimental investigation in structural genomics. A regularly updated worm fold analysis will be available from bioinfo.mbb.yale.edu/genome/worm.
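The summary figures in the abstract can be checked with a little arithmetic. The sketch below is only an illustration built from the approximate numbers quoted above; the function names are hypothetical, and the implied total ORF count is a rough inference from "a quarter of the total worm ORFs", not a figure stated in the abstract.

```python
# Approximate figures quoted in the abstract (all rounded by the authors).
FOLDS = 250            # distinct folds with matches
DOMAINS = 8000         # matched domains
MATCHED_ORFS = 4500    # ORFs containing at least one match
MATCHED_FRACTION = 0.25  # "a quarter of the total worm ORFs"


def matches_per_fold(domains: int, folds: int) -> float:
    """Average number of matched domains per fold."""
    return domains / folds


def implied_total_orfs(matched_orfs: int, fraction: float) -> float:
    """Total ORF count implied by the matched ORFs and their stated share.

    This is an inference for illustration only, not a value from the paper.
    """
    return matched_orfs / fraction


if __name__ == "__main__":
    print(matches_per_fold(DOMAINS, FOLDS))                    # 32.0
    print(implied_total_orfs(MATCHED_ORFS, MATCHED_FRACTION))  # 18000.0
```

The first value reproduces the "about 32 matches per fold" quoted in the abstract; the second is only an order-of-magnitude estimate, since every input is approximate.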

Related resources

Dry Matter and Crude Protein Degradability of Mopane Worm (Imbrasia belina) in Rumen of Steers

Three cannulated Tswana steers were used to investigate the rumen degradability of mopane worm (Imbrasia belina) by measuring the amount of dry matter (DM) and crude protein (CP) disappearing at incubation periods up to 72 h. The effective degradability (ED) of DM and CP in the rumen was calculated at outflow rates of 0.03/h (ED0.03) and 0.05/h (ED0.05). Rumen degradable CP (RDP) was estimat...

Structural Genomics Analysis: Characteristics of Atypical, Typical, and Horizontally Transferred Folds

We conducted a structural genomics analysis of the folds and structural superfamilies in the first 20 completely sequenced genomes by focusing on the patterns of fold usage and trying to identify structural characteristics of typical and atypical folds. We assigned folds to sequences using PSI-blast, run with a systematic protocol to reduce the amount of computational overhead. On average, fold...

PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.

As the number of protein folds is quite limited, a mode of analysis that will be increasingly common in the future, especially with the advent of structural genomics, is to survey and re-survey the finite parts list of folds from an expanding number of perspectives. We have developed a new resource, called PartsList, that lets one dynamically perform these comparative fold surveys. It is availa...

Digging for Dead Genes: An Analysis of the Characteristics and Distribution of the Pseudogene Population in the Ribbon Worm Genome

Pseudogenes are non-functioning copies of genes in genomic DNA, which may either result from reverse transcription from a messenger RNA transcript (termed processed pseudogenes) or from gene duplication and subsequent disablement (non-processed pseudogenes). As pseudogenes are apparently ‘dead’, they usually have a variety of disablements (e.g. insertions, deletions, frameshifts and truncations...

Journal:
  • Pacific Symposium on Biocomputing

Publication date: 2000